-
Notifications
You must be signed in to change notification settings - Fork 13.5k
Apply effects to otherwise
edge in dataflow analysis
#142707
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
rustbot has assigned @petrochenkov. Use |
This comment has been minimized.
This comment has been minimized.
r? compiler |
hm. |
This comment has been minimized.
This comment has been minimized.
Update `MaybeUninitializedPlaces` and `MaybeInitializedPlaces` for `otherwise` <!-- homu-ignore:start --> <!-- If this PR is related to an unstable feature or an otherwise tracked effort, please link to the relevant tracking issue here. If you don't know of a related tracking issue or there are none, feel free to ignore this. This PR will get automatically assigned to a reviewer. In case you would like a specific user to review your work, you can assign it to them by using r? <reviewer name> --> <!-- homu-ignore:end --> This allows `ElaborateDrops` to remove drops when a `match` wildcard arm covers multiple no-Drop enum variants. It modifies dataflow analysis to update the `MaybeUninitializedPlaces` and `MaybeInitializedPlaces` data for a block reached through an `otherwise` edge. This appears to fix #142705, but I don't know for sure if it's actually correct (are there cases where this would be wrong and break things?). I also haven't tested that it actually improves compile times, or machine code output outside of the examples in the issue.
This comment has been minimized.
This comment has been minimized.
8d837ad
to
10e1a9d
Compare
Finished benchmarking commit (e3d7e41): comparison URL. Overall result: ❌✅ regressions and improvements - please read the text belowBenchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary -0.2%, secondary 2.8%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (primary 0.9%, secondary -5.0%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeResults (primary -0.1%, secondary -0.1%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 691.904s -> 692.572s (0.10%) |
Hmm that's unfortunate. It might help to avoid paying the cost of the extra clone and checks when it won't make a difference, like in the trivial What's more concerning is runtime benchmarks being worse. There are only two, and I don't know how noisy they are, but 8% wall time increase for css-parse-fb is pretty bad. There are also some compile time benchmarks whose graphs show significant increases in wall time spent in LLVM. Does this just not play well with later optimizations? |
Some changes occurred to MIR optimizations cc @rust-lang/wg-mir-opt |
The last commit makes it so that there's an option in the |
the runtime benchmarks are unfortunately very noisy |
move_data: &MoveData<'tcx>, | ||
enum_place: mir::Place<'tcx>, | ||
active_variant: VariantIdx, | ||
mut handle_inactive_variant: impl FnMut(MovePathIndex), | ||
mut handle_active_variant: Option<impl FnMut(MovePathIndex)>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it will be better, perf-wise to make this not take an option but just use a no-op for callers that do not want to handle it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's definitely possible, since I think in that case on_all_children_bits
could be monomorphized into an empty function. I made the change and it does reduce stage2 librustc_driver.so
size a bit.
this part of MIR is a little above my level of confidence for review, so r? compiler |
r? wg-mir-opt |
In the last commit I fixed the test file and incorporated the suggestions. I used a |
Yes, it should have been called on the predecessor which contains the switch. I haven't followed all the discussion, but if removing unused switch int handling in the backward direction makes anything easier that seems perfectly fine. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please squash your commits into one and update the PR description, remembering to use https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword.
tests/mir-opt/otherwise_drops.rs
Outdated
|
||
// Ensures there are no drops for the wildcard match arm. | ||
|
||
// EMIT_MIR otherwise_drops.result_ok.ElaborateDrops.after.mir |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps you want to add //@ compile-flags: -Cpanic=abort
or // EMIT_MIR_FOR_EACH_PANIC_STRATEGY
. I prefer the first one here.
tests/mir-opt/otherwise_drops.rs
Outdated
fn result_ok(result: Result<String, ()>) -> Option<String> { | ||
// CHECK-LABEL: fn result_ok( | ||
// CHECK-NOT: drop | ||
// CHECK: return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// CHECK: return |
The return statement isn't the final line of this function.
tests/mir-opt/otherwise_drops.rs
Outdated
fn main() { | ||
result_ok(Err(())); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fn main() { | |
result_ok(Err(())); | |
} |
Removing this function makes file check easier.
@bors2 try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
Apply effects to `otherwise` edge in dataflow analysis <!-- homu-ignore:start --> <!-- If this PR is related to an unstable feature or an otherwise tracked effort, please link to the relevant tracking issue here. If you don't know of a related tracking issue or there are none, feel free to ignore this. This PR will get automatically assigned to a reviewer. In case you would like a specific user to review your work, you can assign it to them by using r? <reviewer name> --> <!-- homu-ignore:end --> This allows `ElaborateDrops` to remove drops when a `match` wildcard arm covers multiple no-Drop enum variants. It modifies dataflow analysis to update the `MaybeUninitializedPlaces` and `MaybeInitializedPlaces` data for a block reached through an `otherwise` edge. This appears to fix #142705, but I don't know for sure if it's actually correct (are there cases where this would be wrong and break things?). I also haven't tested that it actually improves compile times, or machine code output outside of the examples in the issue.
This comment has been minimized.
This comment has been minimized.
Finished benchmarking commit (237f435): comparison URL. Overall result: ❌✅ regressions and improvements - please read the text belowBenchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary 2.7%, secondary 3.2%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (primary 1.0%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeResults (primary -0.1%, secondary -0.1%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 467.247s -> 463.296s (-0.85%) |
881c62f
to
d9dd6ee
Compare
Squashed and force pushed, the only changes since the last commit are addressing the test file review comments. Regarding the perf run results, there are 8 cases this could affect:
The first perf run handled The first has the strongest results, with a lot of improvements and a few regressions. (Also a runtime test regression, which seems to follow a step function where it fluctuates between either exactly 0 or exactly 1.9% relative to the baseline over the past 30 days and went back to 1.9% here, not sure what that means) The second still shows an overall improvement in compile times but applies to much fewer tests. It also has a different runtime regression with the same binary/step pattern. The third has more regressions and fewer improvements, and an overall regression in compile times for all benchmarks (mostly due to the secondary benchmarks especially Should I just enable the change for all 8 cases again? Or make it as limited as possible (number 7 alone is enough to fix the wildcard problem), or try other combinations? |
d9dd6ee
to
c7ef03a
Compare
These can be subsequent PRs. Thanks! |
☀️ Test successful - checks-actions |
What is this?This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.Comparing 34097a3 (parent) -> d350797 (this PR) Test differencesShow 12 test diffsStage 1
Stage 2
Additionally, 10 doctest diffs were found. These are ignored, as they are noisy. Job group index
Test dashboardRun cargo run --manifest-path src/ci/citool/Cargo.toml -- \
test-dashboard d350797b7e1a5b6952f5035cbad2a21613b32f6c --output-dir test-dashboard And then open Job duration changes
How to interpret the job duration changes?Job durations can vary a lot, based on the actual runner instance |
Finished benchmarking commit (d350797): comparison URL. Overall result: ❌✅ regressions and improvements - please read the text belowOur benchmarks found a performance regression caused by this PR. Next Steps:
@rustbot label: +perf-regression Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary 2.7%, secondary 0.5%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (secondary -0.6%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeResults (primary -0.1%, secondary -0.1%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 467.127s -> 465.667s (-0.31%) |
More improvements than regressions, and improvements are mostly in primary benchmarks, while regressions are in secondary benchmarks. This comment mentions a plan for possible future improvements. @rustbot label: +perf-regression-triaged |
This allows
ElaborateDrops
to remove drops when amatch
wildcard arm covers multiple no-Drop enum variants. It modifies dataflow analysis to update theMaybeUninitializedPlaces
andMaybeInitializedPlaces
data for a block reached through anotherwise
edge.Fixes #142705.